D.5 EXCEL
Approximate Cost: $140 (Most often bundled with Microsoft Office Suite, cost variable around $300)
Source: Microsoft Store (www.microsoftstore.com/store/msstore/home)
Current Version: 2010 for Microsoft Windows and 2011 for Mac OS X
Operating System Needs: Microsoft Windows or Mac OS X
Input Structure: Enter data into a grid of cells arranged in numbered rows and lettered or numbered columns. Data type varies and includes numeric, text, dates, time, and percentage.
Overview
Excel is a commercial, flexible spreadsheet-based program. This program comes preconfigured to perform several parametricA statistical test that depends upon or assumes observations from a particular probability distribution or distributions (Unified Guidance). statistical tests ranging from t-tests to two-way analysis of varianceThe square of the standard deviation (EPA 1989); a measure of how far numbers are separated in a data set. A small variance indicates that numbers in the dataset are clustered close to the mean. with replications. In the native capability configuration, Excel offers a narrow rangeThe difference between the largest value and smallest value in a dataset (NIST/SEMATECH 2012). of statistical functions that have limited usefulness when evaluating groundwater data. You must use add-ins to evaluate groundwater data.
Excel supports several types of expansion. Limited formula syntax allows you to develop statistical packages. In addition, numerous commercially-available statistical add-ins are available. Programing with Visual Basic for Applications (VBA) allows data manipulations that are difficult to accomplish with the packaged formula syntax.
Disclaimer: Statistical functions and capabilities presented for this software package have not been reviewed or verified by Microsoft.
|
Statistical Method |
Capability As Is |
Capability with Scripts/Add-Ins |
|---|---|---|
|
Handling of NDs |
|
|
|
|
● |
|
|
|
● |
|
|
|
● |
|
|
|
● |
|
|
Exploratory/Diagnostic Tools |
|
|
|
Summary Statistics |
● |
● |
|
|
● |
|
|
|
● |
|
|
Data transformations |
● |
● |
|
Statistical Design |
|
|
|
Statistical Power |
|
● |
|
|
N/A |
|
|
Contaminant ranking |
|
● |
|
|
N/A |
|
|
Statistical Limits |
|
|
|
● |
● |
|
|
|
● |
|
|
|
● |
|
|
Testing Compliance Limits |
|
● |
|
Graphics |
|
|
|
Plots/Charts |
● |
● |
|
Batch plots |
● |
● |
|
Tweaking of graphics |
● |
● |
|
Statistical Comparisons |
|
|
|
● |
● |
|
|
● |
● |
|
|
Spatial Analysis |
|
|
|
Geostatistics/Mapping |
|
● |
|
|
● |
|
|
|
● |
|
|
Regression/Time Series |
|
|
|
● |
● |
|
|
|
● |
|
|
● |
● |
|
|
● |
● |
|
|
|
● |
|
|
|
● |
|
|
Multivariate Analysis |
|
|
|
Multiple regression |
|
● |
|
Factor/Discriminant analysis |
|
● |
|
|
● |
Capability Ratings:
N/A = Not applicable or not available
● = Full capability
◒ = Some capability
(blank cell) = No capability
Add-Ins Available
Available through Microsoft and other commercial entities. Some examples include Analysis ToolPak for Excel 2010 (www.microsoftstore.com), xlstatistician (www.xlstatistician.com), and XLStat (www.xlstat.com). Examine add-ins carefully to confirm that they contain the statistical procedures of interest.
Ease of Use and Data Import
You can enter data into Excel in a variety of ways ranging from simple keystroke to importing data contained in databases like Access. Standard rules of relational database development must be used if the data are going to be imported from an external database. For example, the fields can be delimited in a variety of ways, such as tab-delimited or comma-separated values (CSV), but must have specific field names. Each groundwater measurement must occupy one record of the input text file.
Types of Distributions
Excel accepts data of any distributional type. You can apply data transformations within Excel. Statistical procedures within the native capability of Excel are parametric.
Visualization
The native capability of Excel includes basic built-in graphics for data visualization (such as scatter plots, histograms, and line plots). You can alter the graphics formatting.
Primary Uses for Groundwater Data Analysis
Excel is a commercial spreadsheet application created as a general but flexible data analysis tool that can be applied across a broad range of disciplines (for example, in business, engineering, finance, and science). The native capability of Excel is not specifically tailored for statistical evaluation or optimization of groundwater monitoring networks. However, add-ins and custom code creation (in VBAVisual Basic for Applications) allows you to create highly tailored statistical functions that can be used to evaluate or optimize groundwater monitoring networks. Excel is easy to use for simple statistical applications, ubiquitous and compatible with other Microsoft Office applications such as the Access database tools. Data and results can be easily exported from Excel into other applications such as ESRI geographic information system (GIS) tools.
Benefits
- Excel is a widely used spreadsheet application. The in-program help function is intuitive and the learning curve is short for native capability functions. Excel does not require advanced training or a specific statistical skill set.
- Many websites provide help for add-ins or for creating specific applications through the Excel-specific formula syntax.
- Excel is convenient for data entry and manipulating rows and columns of data.
- Excel can be used for rapid, preliminary analysis of data. Tables and graphs of results can be professionally formatted, rapidly and easily brought into presentation tools such as PowerPoint.
- Custom functions can be programmed using macros programmed in VBAVisual Basic for Applications.
Limitations and Data Requirements
- Native capability statistical tests do not include nonparametricStatistical test that does not depend on knowledge of the distribution of the sampled population (Unified Guidance). methods or several other statistical tests (for example, prediction limitsIntervals constructed to contain the next few sample values or statistics within a known probability (Unified Guidance).).
- No groundwater-tailored evaluation functions are offered in native Excel.
- Previous versions of Excel (for example, Excel 2007) produced erroneous results for some statistical procedures.
- Distributions are not computed with precision.
- Excel has limited accuracy with very large and very small numbers. The precision is confined to 15 significant figures, which may be further limited by rounding off and binary storage.
- Ranks of tied data are nonstandard.
- No record is made of the process to results.
- Historical compatibility issues with macros programmed in earlier versions of the software. Microsoft has a history of changing how custom buttons and objects are handled in applications built on the Excel platform.
References
Burns Statistics, http://www.burns-stat.com/, (click on tutorials, spreadsheet addiction).
Goldwater, A. 2007. Using Excel for Statistical Data Analysis – Caveats. Biostatistics Consulting Center, University of Massachusetts School of Public Health.
Heilberger, R. M., and E. Neuwirth. 2009. R Through Excel: A Spreadsheet Interface for Statistics, Data Analysis and Graphics. London: Springer Dordrecht Heidelberg.
MCCullough, B. D., and D. A Heiser. 2008. “On the Accuracy of Statistical Procedures in Microsoft Excel 2007,” Computational Statistics and Data Analysis 52: 4570-4578.
Practical Stats, http://www.practicalstats.com/xlsstats/excelstats.html.
Yalta, A. T. 2008. “The Accuracy of Statistical Distributions in Microsoft Excel 2007,” Computational Statistics and Data Analysis 52: 4579-4586.
Publication Date: December 2013